17 research outputs found

Absolute Convergence of Rational Series is Semi-decidable

Author: A. Salomaa
C. Cortes
C. Cortes
F. Denis
F. Denis
J. Berstel
R.B. Lyngsø
Publication venue: HAL CCSD
Publication date: 01/01/2009

We study \emph{real-valued absolutely convergent rational series}, i.e. functions $r: \Sigma^* \rightarrow {\mathbb R}$, defined over a free monoid $\Sigma^*$, that can be computed by a multiplicity automaton $A$ and such that $\sum_{w\in \Sigma^*}|r(w)|<\infty$. We prove that any absolutely convergent rational series $r$ can be computed by a multiplicity automaton $A$ which has the property that $r_{|A|}$ is simply convergent, where $r_{|A|}$ is the series computed by the automaton $|A|$ derived from $A$ by taking the absolute values of all its parameters. Then, we prove that the set ${\cal A}^{rat}(\Sigma)$ composed of all absolutely convergent rational series is semi-decidable and we show that the sum $\sum_{w\in \Sigma^*}|r(w)|$ can be estimated to any accuracy rate for any $r\in {\cal A}^{rat}(\Sigma)$. We also introduce a spectral radius-like parameter $\rho_{|r|}$ which satisfies the following property: $r$ is absolutely convergent iff $\rho_{|r|}<1$.
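The relationship between a multiplicity automaton A and its absolute-value automaton |A| can be illustrated with a small sketch. It is not the semi-decision procedure of the paper. It only shows the elementary bound |r(w)| <= r_{|A|}(w), which follows from the triangle inequality, and how the values of r_{|A|} can be summed word length by word length. The representation (initial vector `iota`, one transition matrix per letter in `mats`, final vector `tau`) and all function names are assumptions made for this example.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def series_value(iota, mats, tau, word):
    """r(w) = iota^T * M_{w1} * ... * M_{wn} * tau."""
    v = list(tau)
    for letter in reversed(word):
        v = mat_vec(mats[letter], v)
    return dot(iota, v)


def abs_automaton(iota, mats, tau):
    """Build |A| by taking absolute values of every parameter of A."""
    return (
        [abs(x) for x in iota],
        {a: [[abs(x) for x in row] for row in m] for a, m in mats.items()},
        [abs(x) for x in tau],
    )


def mass_up_to(iota, mats, tau, max_len):
    """Sum of r_{|A|}(w) over all words with |w| <= max_len.

    With M the sum of the absolute-value letter matrices, the words of
    length k contribute |iota|^T * M^k * |tau|. Since |r(w)| <= r_{|A|}(w),
    the result is an upper bound on the sum of |r(w)| over the same words.
    """
    a_iota, a_mats, a_tau = abs_automaton(iota, mats, tau)
    n = len(a_tau)
    m_sum = [[sum(m[i][j] for m in a_mats.values()) for j in range(n)]
             for i in range(n)]
    total, v = 0.0, a_tau
    for _ in range(max_len + 1):
        total += dot(a_iota, v)
        v = mat_vec(m_sum, v)
    return total


# Hypothetical one-state automaton over {a, b}: r(w) = 0.25^|w| * (-1)^(#b in w).
iota, mats, tau = [1.0], {"a": [[0.25]], "b": [[-0.25]]}, [1.0]
bound = mass_up_to(iota, mats, tau, 50)  # partial sums of 0.5^k, approaching 2
```

The bound is only informative when r_{|A|} itself converges; the abstract states that every absolutely convergent rational series admits at least one automaton with that property.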

Crossref

HAL AMU

Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots

Author: C. Alkan
C. Liu
C. Theis
C.M. Reidys
E. Bindewald
E. Rivas
H. Yang
J. Zhao
J.E. Tabaska
M. Jiang
M.R. Garey
M.V. Ashley
R. Nussinov
R.B. Lyngsø
R.B. Lyngsø
S. Griffiths-Jones
S. Ieong
T. Akutsu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Predicting the folding of an RNA sequence, while allowing general pseudoknots (PK), consists in finding a minimal free-energy matching of its $n$ positions. Assuming independently contributing base-pairs, the problem can be solved in $\Theta(n^3)$ time using a variant of maximum weighted matching. By contrast, the problem was previously proven NP-hard in the more realistic nearest-neighbor energy model. In this work, we consider an intermediate model, called the stacking-pairs energy model. We extend a result by Lyngsø, showing that RNA folding with PK is NP-hard within a large class of parametrizations of the model. We also show the approximability of the problem, by giving a practical $\Theta(n^3)$ algorithm that achieves at least a $5$-approximation for any parametrization of the stacking model. This contrasts with the nearest-neighbor version of the problem, which we prove cannot be approximated within any positive ratio, unless $P=NP$.

Predicting the folding, with general pseudoknots, of an RNA sequence of length $n$ is equivalent to searching for a matching of minimal free energy. In a simple energy model, where each base pair contributes independently to the energy, this problem can be solved in $\Theta(n^3)$ time using a variant of a maximum weighted matching algorithm. However, the same problem has been shown to be NP-hard in the so-called nearest-neighbor energy model. In this work, we study the properties of the problem under a stacking model, which is intermediate between the base-pair and nearest-neighbor models. We first show that RNA folding with pseudoknots remains NP-hard under many valuations of the energy model. We also show that the problem is approximable, by proposing a polynomial-time algorithm guaranteeing a $1/5$-approximation. This result illustrates an essential difference between this model and the nearest-neighbor model, which we show cannot be approximated within any positive ratio by a polynomial-time algorithm unless $P=NP$.
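The base-pair model mentioned in the abstract, where each pair contributes independently and pseudoknots (crossing pairs) are allowed, can be sketched on a toy instance. The code below is a brute-force search over bitmasks with exponential running time, not the $\Theta(n^3)$ matching-based algorithm the abstract refers to. The energy values in `PAIR_ENERGY` and all function names are hypothetical, chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical pair energies (lower is better); not taken from the paper.
PAIR_ENERGY = {frozenset("GC"): -3, frozenset("AU"): -2, frozenset("GU"): -1}


def pair_energy(x, y):
    """Energy of pairing bases x and y, or None if they cannot pair."""
    return PAIR_ENERGY.get(frozenset((x, y)))


def min_energy_matching(seq):
    """Minimum-energy matching of positions; crossing pairs are allowed.

    Exhaustive search memoised on the set of still-undecided positions,
    so it is only usable for very short sequences.
    """
    n = len(seq)

    @lru_cache(maxsize=None)
    def best(free):
        if free == 0:
            return 0.0, ()
        i = (free & -free).bit_length() - 1   # lowest undecided position
        rest = free & ~(1 << i)
        best_e, best_pairs = best(rest)       # option 1: leave i unpaired
        for j in range(i + 1, n):             # option 2: pair i with some j
            if (rest >> j) & 1:
                e = pair_energy(seq[i], seq[j])
                if e is None:
                    continue
                sub_e, sub_pairs = best(rest & ~(1 << j))
                if e + sub_e < best_e:
                    best_e, best_pairs = e + sub_e, ((i, j),) + sub_pairs
        return best_e, best_pairs

    return best((1 << n) - 1)


energy, pairs = min_energy_matching("GACU")
# pairs (0, 2) and (1, 3) cross each other, i.e. they form a pseudoknot
```

Because every pair is scored independently, nothing prevents the optimum from containing crossing pairs, which is why the problem reduces to a weighted matching in this model.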

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Polytechnique

Inapproximability of maximal strip recovery

Author: C. Zheng
C.H. Papadimitriou
E. Hazan
I. Dinur
J. Akiyama
J. Akiyama
L. Bulteau
L. Wang
M. Chlebík
M. Jiang
M. Jiang
P. Alimonti
R. Bar-Yehuda
R.B. Lyngsø
Z. Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given $d$ genomic maps as sequences of gene markers, the objective of MSR-$d$ is to find $d$ subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant $d \ge 2$, a polynomial-time $2d$-approximation for MSR-$d$ was previously known. In this paper, we show that for any $d \ge 2$, MSR-$d$ is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-$d$ for all $d \ge 2$. In particular, we show that MSR-$d$ is NP-hard to approximate within $\Omega(d/\log d)$. From the other direction, we show that the previous $2d$-approximation for MSR-$d$ can be optimized into a polynomial-time algorithm even if $d$ is not a constant but is part of the input. We then extend our inapproximability results to several related problems including CMSR-$d$, $\delta$-gap-MSR-$d$, and $\delta$-gap-CMSR-$d$.

Comment: A preliminary version of this paper appeared in two parts in the Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC 2009) and the Proceedings of the 4th International Frontiers of Algorithmics Workshop (FAW 2010).

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

A Combinatorial Framework for Designing (Pseudoknotted) RNA Algorithms

We extend a hypergraph representation, introduced by Finkelstein and Roytberg, to unify dynamic programming algorithms in the context of RNA folding with pseudoknots. Classic applications of RNA dynamic programming (energy minimization, partition function, base-pair probabilities...) are reformulated within this framework, giving rise to very simple algorithms. This reformulation allows one to conceptually detach the conformation space/energy model -- captured by the hypergraph model -- from the specific application, assuming unambiguity of the decomposition. To ensure the latter property, we propose a new combinatorial methodology based on generating functions. We extend the set of generic applications by proposing an exact algorithm for extracting generalized moments in weighted distribution, generalizing a prior contribution by Miklos et al. Finally, we illustrate our full-fledged programme on three exemplary conformation spaces (secondary structures, Akutsu's simple type pseudoknots and kissing hairpins). This readily gives sets of algorithms that are either novel or have complexity comparable to classic implementations for minimization and Boltzmann ensemble applications of dynamic programming.

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Polytechnique

HAL-Rennes 1

Metrics and similarity measures for hidden Markov models

Author: Lyngsø R.B.
Nielsen Henrik
Pedersen C.N.S.
Publication venue: AAAI Press
Publication date: 01/01/1999

Hidden Markov models were introduced in the beginning of the 1970's as a tool in speech recognition. During the last decade they have been found useful in addressing problems in computational biology such as characterising sequence families, gene finding, structure prediction and phylogenetic analysis. In this paper we propose several measures between hidden Markov models. We give an efficient algorithm that computes the measures for left-right models, e.g. profile hidden Markov models, and briefly discuss how to extend the algorithm to other types of models. We present an experiment using the measures to compare hidden Markov models for three classes of signal peptides.

Introduction: A hidden Markov model describes a probability distribution over a potentially infinite set of sequences. It is convenient to think of a hidden Markov model as generating a sequence according to some probability distribution by following a first-order Markov chain of states, called the path, from a sp..

CiteSeerX

Online Research Database In Technology

Approximating the 2-Interval Pattern problem

Author: F. Gavril
G. Blin
I. Dagan
M.C. Golumbic
R.B. Lyngsø
S. Felsner
S. Vialette
T. Akutsu
Publication date: 01/01/2005

We address the problem of approximating the 2-Interval Pattern problem over its various models and restrictions. This problem, which is motivated by RNA secondary structure prediction, asks to find a maximum cardinality subset of a 2-interval set with respect to some prespecified model. For each such model, we give varying approximation quality depending on the different possible restrictions imposed on the input 2-interval set.

CiteSeerX

Crossref

King's Research Portal

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Predicting RNA Secondary Structures: One-grammar-fits-all Solution

Author: A.O. Harmanci
C.M. Reidys
D.P. Giedroc
E. ten Dam
E. Rivas
H. Chen
I. Brierley
J. Reeder
J. Ren
J.N. Zadeh
K. Lee
M. Hamada
R. Dirks
R.B. Lyngsø
R.B. Lyngsø
S. Zakov
S.H. Bernhart
T. Akutsu
Y. Tabei
Y. Uemura
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

LNCS v. 9096, entitled: Bioinformatics Research and Applications: 11th International Symposium, ISBRA 2015, Norfolk, USA, June 7-10, 2015, Proceedings

RNA secondary structures are known to be important in many biological processes. Many programs have been developed for RNA secondary structure prediction. To our knowledge, however, there still exist secondary structures of known RNA sequences which cannot be covered by these algorithms. In this paper, we provide an efficient algorithm that can handle all RNA secondary structures found in the Rfam database. We designed a new stochastic context-free grammar named Rectangle Tree Grammar (RTG) which significantly expands the classes of structures that can be modelled. Our algorithm runs in $O(n^6)$ time and the accuracy is reasonably high, with average PPV and sensitivity over 75%. In addition, the structures that RTG predicts are very similar to the real ones.

Crossref

HKU Scholars Hub

RNA Molecules: Glimpses Through an Algorithmic Lens

Author: A. Condon
A. Xayaphoummine
D.H. Mathews
E. Rivas
J.S. Mattick
J.W. Szostak
M. Andronescu
R.B. Lyngsø
R.M. Dirks
R.M. Dirks
X. Tang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

Crossref

Learning Stochastic Finite Automata

Author: C. Higuera de la
D. Angluin
D. Angluin
F. Bergadano
L. Pitt
L.G. Valiant
M. Mohri
M.H. Harrison
R.B. Lyngsø
R.C. Carrasco
R.C. Carrasco
Publication date: 01/01/2004

Stochastic deterministic finite automata have been introduced and are used in a variety of settings. We report here a number of results concerning the learnability of these finite state machines. In the setting of identification in the limit with probability one, we prove that stochastic deterministic finite automata cannot be identified from only a polynomial quantity of data. If concerned with approximation results, they become PAC-learnable if the $L_\infty$ norm is used. We also investigate queries that are sufficient for the class to be learnable.

CiteSeerX

Crossref